Abstract

In this article we examined predictive validity of Student Risk Screening Scale for Internalizing 
and Externalizing (SRSS-IE) scores for use with elementary-age students (N = 4,465) from 14
elementary schools. Results indicated elementary school students with high levels of risk 
according to fall SRSS-IE scores — especially those with externalizing behaviors — were more 
likely to have lower oral reading fluency scores, lower Measures of Academic Progress (MAP) 
reading scores, more nurse visits, and more days spent in in-school suspension compared to 
students at low risk for externalizing or internalizing behaviors. Educational implications, 
limitations, and future directions are presented.

Predictive Validity of Student Risk Screening Scale for Internalizing and Externalizing
(SRSS-IE) Scores in Elementary Schools 

Throughout the United States, federal, state, and local educational leaders have placed a 
high priority on developing integrated tiered systems of support such as the comprehensive, 
integrated, three-tiered (Ci3T) models of prevention to meet students’ academic, behavioral, and 
social needs (Lane, Kalberg, & Menzies, 2009; McIntosh & Goodman, 2016; Yudin, 2014). 
Such tiered systems offer a cascade of evidence-based strategies, practices, and programs for 
students at each level of prevention: primary (Tier 1) for all, secondary (Tier 2) for some, and 
tertiary (Tier 3) for few (Cook & Tankersley, 2013). The Ci3T model creates a structure for 
preventing learning and behavior challenges from arising and responding
effectively and efficiently when such challenges do arise (Lane, Oakes, Cantwell, & Royer, 
2016). A keystone feature of tiered systems is data-informed decision making, with academic 
and behavior systematic screening data used in tandem to determine how to assist students for 
whom primary prevention efforts—even when implemented with integrity—are insufficient to meet 
students’ multiple needs (Oakes, Lane, Cox, & Messenger, 2014). 

These models may hold particular benefits for students with emotional and behavioral 
disorders (EBD), a large and diverse group of students who struggle with externalizing (e.g., 
aggressive) and internalizing (e.g., anxious) behaviors. Externalizing behaviors often disrupt the 
learning environment by impeding instructional processes, creating challenges not only for the
student struggling with externalizing behaviors but also for his or her peers and teachers. While
internalizing behaviors are often more covert and less apt to negatively impact the learning 
environment, they are no less serious as they adversely affect interpersonal relationships and 
academic performance (Bradshaw, Buckley, & Ialongo, 2008). Teachers consistently report 
managing challenging behaviors as one of the biggest factors impeding effective teaching (New 
Teacher Project, 2013). Clearly, this is no small challenge. 

Forness, Freeman, Paparella, Kauffman, and Walker (2012) report 20% of school-age
children and youth demonstrate mild-to-severe EBD. With less than 1% of students typically
qualifying for special education services under the category of emotional disturbance as defined
in the Individuals with Disabilities Education Improvement Act (2004), this leaves the general
education community largely responsible for meeting the multiple needs of students with and at
risk for EBD. The Ci3T model may provide the ideal context to meet this formidable charge. It is 
important to explore feasible and effective solutions to support all students, particularly those
with EBD, given the long-term negative consequences characteristic of this difficult-to-teach
population: disengagement, school failure, school dropout, impaired personal relationships, and 
increased need for mental health supports (Maggin, Wehby, Farmer, & Brooks, 2016). 

Fortunately, the Institute of Education Sciences (IES) recognizes potential benefits of 
tiered systems as a mechanism that “provides academic, social, emotional, and behavioral 
support for all students, and provides resources and supports that teachers and other school 
personnel need to support” students with and at risk for learning and behavioral challenges in 
authentic educational settings (IES, 2017, p. 17). In the recent request for application (RFA) 
from IES, research considerations were offered across all topic areas: (a) inquiry on 
comprehensive, integrated frameworks and (b) research to develop and evaluate adaptive 
interventions, including individually tailored interventions to assist students with intensive 
intervention needs. This is but one illustration of how federal agencies such as the United States 
Department of Education (USDOE) have prioritized this type of work, with systematic screening 
a central feature needed to facilitate inquiry in both these key objectives. 

Given the importance of using systematic screening data in conjunction with other data 
collected on all students as part of regular school practices (e.g., academic assessments, 
attendance), it is essential for schools to have access to reliable, valid, and feasible screening 
tools (Lane, Oakes, Ennis, & Royer, 2015). Several behavior screening tools have been 
developed and refined to accurately detect students with and at-risk for EBD (Lane & Walker, 
2015). Examples of such tools include: Behavior Assessment System for Children 3rd Edition:
Behavioral & Emotional Screening System (BASC-3: BESS; Kamphaus & Reynolds, 2015);
Social, Academic, and Emotional Behavior Risk Screener© (SAEBRS; Kilgus, Chafouleas,
Riley-Tillman, & von der Embse, 2013); Social Skills Improvement System - Performance
Screening Guide (SSiS-PSG; Elliott & Gresham, 2008); Strengths and Difficulties Questionnaire 
(SDQ; Goodman, 2001); Student Risk Screening Scale (SRSS; Drummond, 1994); Student Risk 
Screening Scale — Internalizing and Externalizing (SRSS-IE; Drummond, 1994; Lane & 
Menzies, 2009); and Systematic Screening for Behavior Disorders (SSBD; Walker, Severson, & 
Feil, 2014). Now, the pivotal question facing many district- and school-site leaders is: Which 
screening tool should we adopt? (Lane, Oakes, Ennis et al., 2015). 

The selection of a behavior screening tool is an important one, guided by a number of 
considerations: facets of behavior challenges to be detected (externalizing and/or internalizing), 
school or grade levels of interest, informant (teacher, parent, and/or student), administration and 
scoring time, associated costs (purchase price, ongoing costs), and availability of intervention 
resources (Lane, Menzies, Oakes, & Kalberg, 2012). In addition to logistical considerations, an 
overarching concern is being certain the screening tool minimizes false negatives (overlooking a 
student who does have the challenge of interest), and, although less of a priority, minimizes false
positives (indicating a student has the challenge of interest when, in fact, they do not). Ideally,
educational leaders’ decision-making processes would not be driven primarily by monetary 
considerations. Yet, in light of current fiscal uncertainty of educational funding, monetary 
concerns are a pragmatic consideration. For some schools, free-access screening tools such as the 
SDQ, SRSS, and SRSS-IE may be the only realistic options (Lane et al., 2017). 

Given these realities, it is imperative for the research community to explore psychometric 
properties of all screening tools, both commercially available and free-access. This inquiry is not
conducted to “prove” any one screening tool is the best option, but to offer the practitioner 
community the full scope of information necessary to inform decision-making processes when 
selecting a screening tool to detect students with and at risk for social, emotional, and behavioral 
challenges. The National Center on Intensive Intervention (NCII) established the Behavior 
Screening technical review committee (TRC) to address this charge. In partnership with the
Academic Screening and Progress Monitoring TRC groups, the following definition of screening
was established: “a process using tools with convincing evidence of classification accuracy,
reliability, and validity to identify students who may require intensive intervention efforts to
meet their academic, social, emotional and/or behavioral needs” (NCII, 2017).

Responding to (a) the need for rigorous inquiry regarding the classification, reliability, 
and validity of existing tools and (b) the fact many school systems may need to move forward 
with free access screening tools due to fiscal challenges, we conducted this predictive validity 
study of SRSS-IE scores in predicting important educational outcomes. We begin by briefly 
examining psychometric properties of the SRSS developed by Drummond (1994), followed by a 
discussion of research conducted on the SRSS-IE—an expanded tool designed to broaden the 
scope of the SRSS to detect students with internalizing as well as externalizing behaviors. 
Psychometric Properties of the SRSS and SRSS-IE 

Several psychometric studies of SRSS scores have examined the utility of this 7-item 
universal screening tool initially designed to detect students with antisocial tendencies. At the 
elementary level, teachers independently rate each student in their homeroom class using a 4- 
point Likert-type scale (never = 0, occasionally = 1, sometimes = 2, frequently = 3). Items 
include: (1) steal; (2) lie, cheat, sneak; (3) behavior problem; (4) peer rejection; (5) low academic 
achievement; (6) negative attitude; and (7) aggressive behavior. A composite score is created by 
summing item level data (range: 0 to 21), with scores used to place students into one of three 
categories: low (0 - 3), moderate (4 - 8), or high (9 - 21) risk (Drummond, 1994). Studies offer 
evidence of score reliability and validity at the elementary level as evidenced by strong internal 
consistency and additional evidence that fall SRSS scores predicted year-end office discipline 
referral (ODR) rates and spring oral reading fluency (ORF) scores (Menzies & Lane, 2012; 
Oakes, Wilder, et al., 2010). In addition, studies offered evidence of convergent validity between 
SRSS scores and SSBD (Lane, Kalberg, Lambert, Crnobori, & Bruhn, 2010; Lane, Little, et al., 
2009) and SSiS-PSG scores (Lane, Richards-Tutor, Oakes, & Conner, 2014). Both the SSBD and 
SSiS-PSG are established, easy-to-use, commercially available behavior screening tools, with the
SSiS offering additional tools such as behavior rating scales and intervention guides.
Noting the feasibility of the SRSS, which requires about 15 minutes to screen
an entire homeroom class, and the initial evidence of classification accuracy, reliability, and
validity, Lane and Menzies (2009) modified the SRSS to add items expanding the scope of the
tool to detect students with internalizing behaviors. This new tool was called the SRSS-IE. 

Lane, Oakes, Harris et al. (2012) first examined the psychometric properties of the SRSS-IE
with a sample of 2,460 elementary students from California and Arizona. Five of the initially
proposed items to detect internalizing behaviors were retained. These internalizing items 
included: (1) emotionally flat; (2) shy, withdrawn; (3) sad, depressed; (4) anxious; and (5) 
lonely. The SRSS-IE with all 12 items is rated using the same Likert-type scale introduced by 
Drummond (1994). In addition to offering initial evidence of reliability, results offered initial 
evidence of the convergent validity between SRSS-IE scores and SSBD and SDQ scores. Lane, 
Menzies, Oakes, Lambert et al. (2012) conducted two replication studies, examining 
psychometric properties of SRSS-IE scores with students in rural (N = 982) and urban (N =
1,079) districts. Results provided additional evidence of reliability, with the same five items
retained and additional evidence of convergent validity between SRSS-IE and SSBD scores.
Collectively, results supported the utility of SRSS-IE in detecting students with externalizing
(SRSS-E7) and internalizing (SRSS-I5) behaviors in a similar fashion to the SSBD. 

To further explore convergent validity, Lane, Oakes, Common et al. (2015) conducted a 
convergent validity study comparing SRSS-IE and SSiS-PSG scores with a sample of 458 K-5 
students from one school in a southeastern state. Correlation analyses indicated statistically 
significant inverse relations between SRSS-IE (SRSS-E7 and SRSS-I5 subscale scores as well as
the total score) and SSiS-PSG subscale scores. ROC analyses comparing scores from students 
with significant difficulty (highest level of risk) to those making adequate progress (typical 
performance) indicated SRSS-IE scores were comparable to SSiS-PSG in detecting Prosocial 
Behavior (area under the curve [AUC] = .972) and Motivation to Learn (AUC = .904). As 
expected, SRSS-IE scores were less accurate than SSiS-PSG scores in detecting academic risk as
that is not the intent of the SRSS-IE (Math Skills AUC = .817; Reading Skills AUC = .805).
Lane, Oakes, Ennis et al. (2015) conducted a replication study, with results offering comparable
findings with a larger sample of 1,680 K-6 students from three schools in a northeastern state. 

Collectively, these studies offered evidence of reliability and validity of the SRSS-IE 
scores, with comparable accuracy to the SSBD, SDQ, and SSiS-PSG scores in detecting students 
with externalizing and internalizing behaviors. A next important step in this line of inquiry is to 
explore predictive validity of SRSS-IE scores. Predictive validity refers to the degree to which a 
score on a scale (e.g., SRSS-E7 or SRSS-I5) predicts scores on a given criterion measure (e.g., 
number of in-school suspensions or the number of nurse visits, each indicating time away from 
instruction). As mentioned, fall SRSS (now called SRSS-E7) scores predict a range of academic 
and behavioral outcomes (e.g., ORF scores, ODR rates). We now intend to replicate predictive 
validity studies of fall SRSS-E7 scores and explore predictive validity of fall SRSS-I5 scores. 
Purpose 

In this study, we provide initial evidence to support the utility of SRSS-IE scores for use 
with elementary students and explore predictive validity of the original SRSS scores. We 
examined predictive validity of fall SRSS-IE scores by analyzing the degree to which K-5 
students with low, moderate, and high risk for externalizing and internalizing behaviors could be 
differentiated on behavioral and academic characteristics according to extant schoolwide data. 
We conducted this study to replicate previous inquiry establishing predictive validity of SRSS- 
E7 scores measuring externalizing behaviors (Menzies & Lane, 2012; Oakes et al., 2010) and 
examine predictive validity of SRSS-I5 scores measuring internalizing behaviors as applied at 
the elementary level. We examined ORF as measured by AIMSweb scores, Measures of 
Academic Progress (MAP), number of nurse visits (as frequent visits could signal a range of 
concerns), and in-school suspensions. We hypothesized SRSS-E7 and SRSS-I5 scores would be 
more reflective of behavioral rather than academic outcomes given the former sets of variables 
are more indicative of constructs measured using the SRSS-IE behavior screening tool (Lane, 
Oakes, Ennis et al., 2015; Lane, Richards-Tutor et al., 2014). 


Method
Participants and Setting

Participants were 4,465 kindergarten through fifth-grade students (2,360 males) attending
one of 14 Midwest elementary schools and rated by their homeroom teachers (n = 219) on the
SRSS-IE. Students were predominantly White (72.81%, n = 3,251), with approximately 16.84%
receiving special education services (see Table 1). Economic disadvantage rates varied (10.4 -
68.7%), with the majority being Title 1 eligible (see Table 2). Schools participating in this study
were in the first year of a researcher-practitioner partnership grant funded by IES focused on the
implementation and evaluation of Ci3T models of prevention.
Procedures 

Ci3T Leadership Teams consisting of the principal, two general education teachers, one 
special education teacher, an individual with expertise in school-based interventions (e.g., 
instructional coach, social worker, school psychologist, or behavior specialist), a parent, and a 
student attended a year-long training series led by university partners to develop a Ci3T model of 
prevention. As part of the Ci3T professional learning series, Ci3T Leadership Teams from each
of 14 elementary schools reviewed current psychometric evidence on existing screening tools 
and listed the top three behavior screening tools of interest. District leaders collaborated with 
Ci3T Trainers to compile these lists, obtain additional information regarding these tools (e.g., 
procedures for administering, scoring, and interpreting; cost; personnel time), and present 
information acquired to their district principal leadership team. The district principal leadership 
team selected the SRSS-IE as this free-access tool took limited teacher time to complete, had 
established psychometric evidence (e.g., convergent validity with other screening tools, predicted 
important school outcomes, strong internal consistency), and could be built and maintained at no 
charge in their existing district database management system. 

According to the district assessment schedule, homeroom teachers independently 
completed the SRSS-IE three times per year: fall (4-6 weeks after the year began), winter (prior 
to winter break), and spring (before year end). Data from systematic screenings were used by the
Ci3T District Leadership Team and Ci3T Leadership Teams from each school to (a) examine
overall level of risk school-wide; (b) inform the use of low-intensity, teacher-delivered supports
to increase engagement and decrease disruption; and (c) connect students with Tier 2 and Tier 3 
supports as needed (Lane, Kalberg, & Menzies, 2009). Prior to completing the SRSS-IE, teachers 
received information from their school’s Ci3T Leadership Team on the rationale and logistics for 
completing screening tools. Professional learning opportunities were available for faculty and 
staff through a range of avenues including after school presentations by Ci3T Trainers, district- 
wide presentations, on-demand resources (e.g., YouTube videos), and practice guides (Lane, 
Carter, Jenkins, Magill, & Germer, 2015). 

The 14 elementary schools, with leadership from each site’s Ci3T Leadership Team, 
began systematic screening for students’ academic and behavior performance using AIMSweb 
and the SRSS-IE during the 2014-2015 academic year. Data presented in this study are from the 
2015-2016 academic year, the elementary schools’ second year of implementing Ci3T and the 
first year the Ci3T District Leadership Team implemented the SRSS-IE districtwide (see Author 
et al., 2017 for results of predictive validity studies of SRSS-IE scores in secondary schools). 

As described in Author et al. (2017), the district developed a secure system for teachers 
to complete the SRSS-IE independently. Students’ names and district identification numbers 
were prepopulated for each elementary teacher’s homeroom approximately 30 days before each
screening window opened. Principals were given permission to view electronic folders prepared for
each teacher two business days before teachers had electronic access to their individual folder.
Several principals examined the folder structure for their schools to ensure each teacher had a
folder and the correct students were prepopulated for each teacher’s class. Teachers had
electronic access to only their homeroom class. As part of their professional learning on 
systematic screening, teachers learned only two total scores, SRSS-E7 and SRSS-I5, would be 
used for decision making and not item-level data. 

The Ci3T District Leadership Team shared de-identified student-level data
electronically with principal investigators following study procedures approved by the
Institutional Review Board and the district. In this paper, we report results of fall 2015
elementary screening data in predicting spring 2016 year-end outcomes: ORF scores (grades 1-5),
MAP (grades 1-5),
nurse visits (grades K-5), and in-school suspensions (grades K-5). 
Measures 

Student Risk Screening Scale - Internalizing and Externalizing (SRSS-IE). The 
SRSS-IE is an efficient, free-access screening tool. Initial items included steal; lie, cheat, sneak; 
behavior problems; peer rejection; low academic achievement; negative attitude; and aggressive 
behavior. Results of a series of psychometric studies yielded five additional items to assess risk 
for internalizing behaviors: emotionally flat; shy, withdrawn; sad, depressed; anxious; lonely. 
Homeroom teachers completed the SRSS-IE for each student, rating each behavior on a 4-point, 
Likert-type scale developed by Drummond (1994) of never = 0, occasionally = 1, sometimes = 2, 
and frequently = 3. The original seven items were summed to form the SRSS-E7 score, with total 
scores used to place students into one of three risk groups: 0-3 low risk, 4-8 moderate risk, 9-21 
high risk. The new five items were summed to form the SRSS-I5 score, with total scores used to 
place students into one of three risk groups: 0-1 low risk, 2-3 moderate risk, 4-15 high risk 
(Lane, Oakes, Swogger et al., 2015). In this study, we used cut scores for SRSS-E7 and SRSS-I5 
subscale scores established respectively by Drummond (1994) and Lane, Oakes, Swogger et al. 
(2015) to examine predictive validity. 
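
The scoring rules described above can be illustrated with a minimal Python sketch. It assumes teacher ratings for one student arrive as a dictionary keyed by item name; the key names and the functions `risk_level` and `score_srss_ie` are hypothetical labels introduced for illustration and are not part of the SRSS-IE or the district database system. The cut scores are those reported in this section.

```python
from typing import Dict, List

# Item keys are hypothetical names for the items listed in the Measures section.
E7_ITEMS: List[str] = [
    "steal", "lie_cheat_sneak", "behavior_problem", "peer_rejection",
    "low_academic_achievement", "negative_attitude", "aggressive_behavior",
]
I5_ITEMS: List[str] = [
    "emotionally_flat", "shy_withdrawn", "sad_depressed", "anxious", "lonely",
]


def risk_level(score: int, moderate_min: int, high_min: int) -> str:
    """Map a subscale total to low, moderate, or high risk."""
    if score >= high_min:
        return "high"
    if score >= moderate_min:
        return "moderate"
    return "low"


def score_srss_ie(ratings: Dict[str, int]) -> Dict[str, object]:
    """Sum item ratings (0-3 each) into SRSS-E7 and SRSS-I5 totals and risk groups."""
    for item in E7_ITEMS + I5_ITEMS:
        if ratings[item] not in (0, 1, 2, 3):
            raise ValueError(f"Rating for {item} must be 0, 1, 2, or 3")
    e7 = sum(ratings[item] for item in E7_ITEMS)  # possible range: 0 to 21
    i5 = sum(ratings[item] for item in I5_ITEMS)  # possible range: 0 to 15
    return {
        "srss_e7": e7,
        "e7_risk": risk_level(e7, moderate_min=4, high_min=9),  # 0-3, 4-8, 9-21
        "srss_i5": i5,
        "i5_risk": risk_level(i5, moderate_min=2, high_min=4),  # 0-1, 2-3, 4-15
    }
```

For example, a student rated 3 on behavior problem and 2 on negative attitude, with all other externalizing items rated 0, would have an SRSS-E7 total of 5 and fall in the moderate-risk group.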

Extant school-wide data. We predicted year-end outcomes: spring ORF scores, MAP, 
nurse visits, and in-school suspensions. Consistent with procedures described by Lane et al.
(2017), district leaders provided de-identified year-end data electronically to principal
investigators. ORF referred to the students’ spring benchmark AIMSweb scores (number of
words read correctly per minute). MAP referred to students’ spring reading assessment percentile
scores. Nurse visits referred to the number of visits a student made to the school nurse for
assistance (e.g., getting a bandage, nausea, fever, somatic complaints). In-school suspensions 
referred to the number of days a student was assigned in-school suspension (a sanction reserved 
for serious rule infractions such as bullying). Each Ci3T Leadership Team developed a
schoolwide reactive plan as part of their Ci3T model of prevention, listing and defining
behaviors warranting an in-school suspension. Upon receiving data, project staff conducted a
series of logic checks to ensure data received reflected accurate ranges. 
Statistical Analysis 

Students were grouped into low-, moderate-, and high-risk levels as previously described.
Next, we examined potential group differences in ORF, MAP, nurse visits, and in-school 
suspensions. To explore potential differences in academic performance (ORF and MAP) by 
group, we fit a mixed model analysis of variance (ANOVA) with group as a fixed effect and 
classroom teacher as a random effect to account for the nested nature of the data (students nested 
in teachers’ classes; Lane et al., 2017). This model allowed us to examine the extent to which 
students scoring in the low, moderate, and high-risk categories according to fall SRSS-E7 and 
SRSS-I5 scores could be distinguished on spring ORF and MAP scores. Significant group effects 
were followed up with a set of pairwise comparisons (k = 3 comparisons: low vs. high, low vs.
moderate, and moderate vs. high). We used a Bonferroni correction to adjust the Type I error- 
rate for post-hoc tests, with the alpha-level for each group comparison set at 0.0167 (.05/3). 
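
The Bonferroni adjustment described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the software used in the study; the names `per_comparison_alpha` and `is_significant` are hypothetical.

```python
from itertools import combinations

# The three risk groups yield three pairwise comparisons.
groups = ["low", "moderate", "high"]
pairs = list(combinations(groups, 2))

family_alpha = 0.05
# Bonferroni: divide the familywise alpha by the number of comparisons.
per_comparison_alpha = family_alpha / len(pairs)


def is_significant(p_value: float, alpha: float = per_comparison_alpha) -> bool:
    """Return True when a post-hoc p value falls below the adjusted alpha."""
    return p_value < alpha
```

Under this rule a post-hoc p value of .0111 falls below the adjusted threshold of roughly .0167, whereas a p value of .03 would not, even though it is below .05.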

Nurse visits and in-school suspensions were measured as counts. For these dependent 
variables, we computed a series of random effects negative binomial regressions with an 
overdispersion parameter. These models account for the nested nature of the data (students 
nested in teachers’ classes) when examining the degree to which students in low-, moderate-, and 
high-risk groups according to SRSS-E7 and SRSS-I5 scores could be differentiated on these
year-end outcomes. As explained in Lane, Oakes, Swogger et al. (2015), we fit negative binomial
regression models for these outcome variables given their respective distributions closely 
resemble a Poisson distribution (commonly seen in count variables), with many students in the 
sample receiving zeros (e.g., zero nurse visits, zero in-school suspensions). The negative 
binomial regression model is useful when dependent variables are distributed as count data and 
standard deviation exceeds the mean count (as with data presented here). 
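
The rationale for choosing a negative binomial model can be illustrated with a simple descriptive check, sketched below in Python. The function name `dispersion_summary` and the example counts in the usage note are hypothetical and are not study data. The sketch reports the criterion named above (standard deviation exceeding the mean) alongside the more common overdispersion criterion (variance exceeding the mean, which a Poisson model assumes to be equal).

```python
from statistics import mean, variance
from typing import Dict, Sequence


def dispersion_summary(counts: Sequence[int]) -> Dict[str, object]:
    """Describe a count variable to judge whether a Poisson model is plausible."""
    m = mean(counts)
    var = variance(counts)  # sample variance (n - 1 denominator)
    sd = var ** 0.5
    return {
        "mean": m,
        "sd": sd,
        "variance": var,
        "proportion_zero": sum(1 for c in counts if c == 0) / len(counts),
        "sd_exceeds_mean": sd > m,
        "variance_exceeds_mean": var > m,  # overdispersion relative to Poisson
    }
```

For a hypothetical set of counts such as `[0, 0, 0, 0, 0, 0, 1, 0, 2, 9]`, most values are zero and a few students account for most events, so both flags are true and a negative binomial model with an overdispersion parameter would be a reasonable choice.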

Analyses were computed using available data. While missing data were not imputed,
missingness was managed using full maximum likelihood estimation for the mixed model
ANOVAs and negative binomial regressions (Enders, 2010).

We calculated effect sizes from observed means and standard deviations to determine the 
magnitude of differences between groups. We used Hedges’ g formula, which incorporates the 
pooled standard deviation in the denominator and accounts for differences in the number of cases 
between groups. Effect sizes were interpreted per the following criteria: small- (0.20), medium- 
(0.50) and large-magnitude effects (0.80; Cohen, 1988). 
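
A minimal Python sketch of the effect size computation follows. It uses the standard pooled standard deviation weighted by each group's degrees of freedom. Whether the small-sample correction factor was applied in this study is not stated, so it is included here as an optional assumption; the function names and the "negligible" label for values below 0.20 are also hypothetical.

```python
import math


def hedges_g(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int,
             small_sample_correction: bool = True) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    # Pool the variances, weighting each group by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    g = (mean1 - mean2) / math.sqrt(pooled_var)
    if small_sample_correction:
        # Assumed correction factor: J = 1 - 3 / (4 * (n1 + n2) - 9)
        g *= 1 - 3 / (4 * (n1 + n2) - 9)
    return g


def magnitude(g: float) -> str:
    """Label an effect size using the Cohen (1988) benchmarks cited above."""
    size = abs(g)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"  # hypothetical label for values below 0.20
```

With the large group sizes in this sample the correction factor is very close to 1, so corrected and uncorrected values would differ only slightly.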

Results 
Externalizing: SRSS-E7 

Findings of a mixed model ANOVA with group as the between-subjects fixed effect and 
teacher as the random effect indicated a group effect for ORF, F(2, 620) = 39.51, p < .0001 (R² =
11%). The low-risk externalizing group earned statistically significantly higher ORF scores than
moderate- (Mean difference = 24.62, 95% confidence limits: 16.05 to 33.18, t = 5.64, p < .0001,
Hedges’ g = 0.61) and high-risk groups (Mean difference = 47.41, 95% confidence limits: 35.06
to 59.76, t = 7.54, p < .0001, Hedges’ g = 1.18). The moderate-risk group had a statistically
significantly higher mean ORF than the high-risk group (Mean difference = 22.80, 95%
confidence limits: 8.70 to 36.89, t = 3.18, p = .0016, Hedges’ g = 0.52; see Table 3). Please see
Table 4 for Pearson correlation coefficients.

Findings of a mixed model ANOVA with group as the between-subjects fixed effect and 
teacher as the random effect indicated a group effect for MAP reading percentile scores, F(2,
2,688) = 238.56, p < .0001 (R² = 15%). The low-risk externalizing group earned statistically
significantly higher MAP scores than moderate- (Mean difference = 23.63, 95% confidence
limits: 20.81 to 26.45, t = 16.45, p < .0001, Hedges’ g = 0.87) and high-risk groups (Mean
difference = 33.22, 95% confidence limits: 29.23 to 37.22, t = 16.32, p < .0001, Hedges’ g =
1.24). The moderate-risk group had a statistically significantly higher mean MAP reading
percentile score than the high-risk group (Mean difference = 9.60, 95% confidence limits: 5.01 to
14.18, t = 4.10, p < .0001, Hedges’ g = 0.32).


For number of nurse visits, we fit a random-effects negative binomial regression model.
The model demonstrated a significant overall omnibus test, F(2, 4,244) = 123.20, p < .0001. Post-hoc
comparisons revealed the low-risk for externalizing group experienced significantly fewer
nurse visits than moderate- (Mean difference = -0.40, 95% confidence limits: -0.47 to -0.32, t =
-10.66, p < .0001, Hedges’ g = 0.41) and high-risk groups (Mean difference = -0.68, 95%
confidence limits: -0.78 to -0.58, t = -13.28, p < .0001, Hedges’ g = 0.79). The moderate-risk
group had statistically significantly fewer nurse visits than the high-risk group for
externalizing behaviors (Mean difference = -0.28, 95% confidence limits: -0.39 to -0.17, t =
-4.91, p < .0001, Hedges’ g = 0.27).

For number of in-school suspensions, we fit a random-effects negative binomial 
regression model. The model demonstrated a significant overall omnibus test, F(2, 4,244) = 
41.44, p < .0001. Post-hoc comparisons revealed the low-risk for externalizing group 
experienced significantly fewer in-school suspensions than moderate- (Mean difference = -2.18, 
95% confidence limits: -2.84 to -1.52, t = -6.48, p < .0001, Hedges’ g = 0.23) and high-risk
groups (Mean difference = -3.27, 95% confidence limits: -3.99 to -2.56, t = -8.94, p < .0001,
Hedges’ g = 0.63). Furthermore, students in the moderate-risk group earned fewer in-school
suspensions than students in the high-risk group (Mean difference = -1.10, 95% confidence
limits: -1.74 to -0.46, t = -3.35, p = .0008, Hedges’ g = 0.20).

Internalizing: SRSS-I5 

Findings of a mixed model ANOVA with group as the between-subjects fixed effect and 
teacher as the random effect indicated a group effect for ORF, F(2, 620) = 7.57, p = .0006 (R² =
2%). The low-risk internalizing group earned statistically significantly higher ORF scores than
the high-risk group (Mean difference = 19.87, 95% confidence limits: 9.36 to 30.37, t = 3.71, p =
.0002, Hedges’ g = 0.47). There were no statistically significant differences in ORF scores
between low- and moderate- or between moderate- and high-risk groups (see Table 3). 

Findings of a mixed model ANOVA with group as the between-subjects fixed effect and 
teacher as the random effect indicated a group effect for MAP reading percentile scores, F(2,
2,688) = 63.74, p < .0001 (R² = 5%). The low-risk internalizing group earned statistically
significantly higher MAP scores than moderate- (Mean difference = 9.45, 95% confidence limits:
6.18 to 12.72, t = 5.67, p < .0001, Hedges’ g = 0.33) and high-risk groups (Mean difference =
19.81, 95% confidence limits: 16.08 to 23.54, t = 10.41, p < .0001, Hedges’ g = 0.69). The
moderate-risk group had a statistically significantly higher mean MAP reading percentile score
than the high-risk group (Mean difference = 10.36, 95% confidence limits: 5.72 to 14.99, t =
4.38, p < .0001, Hedges’ g = 0.33).

For number of nurse visits, we fit a random-effects negative binomial regression model. 
The model demonstrated a significant overall omnibus test, F(2, 4,244) = 17.14, p < .0001. Post-hoc
comparisons revealed the low-risk for internalizing group experienced significantly fewer
nurse visits than moderate- (Mean difference = -0.11, 95% confidence limits: -0.19 to -0.02, t =
-2.54, p = .0111, Hedges’ g = 0.10) and high-risk groups (Mean difference = -0.29, 95%
confidence limits: -0.39 to -0.19, t = -5.64, p < .0001, Hedges’ g = 0.32). The moderate-risk
group had statistically significantly fewer nurse visits than the high-risk group for
internalizing behaviors (Mean difference = -0.18, 95% confidence limits: -0.30 to -0.06, t =
-2.94, p = .0033, Hedges’ g = 0.19).

For number of in-school suspensions, we fit a random-effects negative binomial 
regression model. The model demonstrated a significant overall omnibus test, F(2, 4,244) = 9.35, 
p < .0001. Post-hoc comparisons revealed the low-risk for internalizing group experienced 
significantly fewer in-school suspensions than moderate- (Mean difference = -1.36, 95% 
confidence limits: -2.02 to -0.70, t = -4.05, p < .0001, Hedges’ g = 0.20) and high-risk groups 
(Mean difference = -1.08, 95% confidence limits: -1.91 to -0.25, t = -2.55, p = .0109, Hedges’ g = 
0.13). There were no statistically significant differences in mean in-school suspension scores 
between moderate- and high-risk groups. 
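
Hedges’ g is reported for each pairwise comparison above. As a rough illustration only, the sketch below computes the standard small-sample-corrected standardized mean difference from group summary statistics. This section does not say whether the authors computed g from raw or model-adjusted values, and the input numbers are hypothetical, chosen only to show the arithmetic.

```python
import math


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference with Hedges' small-sample correction."""
    # Pooled variance across the two groups
    pooled_var = ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)
    cohens_d = (mean_1 - mean_2) / math.sqrt(pooled_var)
    # Approximate correction factor J = 1 - 3 / (4 * df - 1), df = n_1 + n_2 - 2
    correction = 1 - 3 / (4 * (n_1 + n_2 - 2) - 1)
    return cohens_d * correction


# Hypothetical summary statistics (not taken from the study)
g_example = hedges_g(mean_1=60.0, sd_1=10.0, n_1=50, mean_2=50.0, sd_2=10.0, n_2=50)
```

With groups as large as those in this study, the correction factor is very close to 1, so g and the uncorrected standardized mean difference are nearly identical.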

Discussion 

Psychometric studies of SRSS-IE scores offer evidence of reliability and validity, 
including results suggesting comparable accuracy between SRSS-IE scores and other validated 
screening tools’ scores (e.g., SSBD, SDQ, SSiS-PSG) in detecting students with externalizing 
and internalizing behaviors. We conducted the current study to explore predictive validity of 
SRSS-IE scores, an essential next step in the systematic line of inquiry establishing the SRSS-IE 
as a psychometrically sound, feasible tool for school use. 

Findings offer additional evidence of fall SRSS-E7 scores predicting behavioral and 
academic year-end outcomes for elementary-age students consistent with early inquiry of the 
original SRSS tool (e.g., Menzies & Lane, 2012; Oakes et al., 2010). Furthermore, we provide 
initial evidence suggesting fall SRSS-I5 scores also predict important educational outcomes for 
students. 

Predictive Validity of Externalizing Scores in Elementary Schools 

In predicting year 1 outcomes at the elementary school level, fall SRSS-E7 scores 
differentiated low-, moderate-, and high-risk groups on ORF, MAP reading, nurse visits, and 
in-school suspensions. The low-risk group had statistically significantly higher year-end ORF and 
MAP scores as well as fewer nurse visits and in-school suspensions compared to moderate- and 
high-risk groups. Furthermore, moderate- and high-risk group mean scores could also be 
differentiated, with students in the high-risk group having the most negative outcomes. 

Findings were highly similar to those of previous short-term (one year) predictive validity studies 
indicating fall SRSS-E7 scores predicted year-end behavioral outcomes (e.g., ODRs and 
self-control skills; Menzies & Lane, 2012; Oakes et al., 2010), as well as academic outcomes, such as 
ORF (Oakes et al., 2010) and proficiency in language arts skills (Menzies & Lane, 2012). Yet, 
this is the first study at the elementary level to explore the extent to which screening scores 
predicted in-school suspensions and nurse visits. We learned fall SRSS-E7 scores differentiated 
students in the low-, moderate-, and high-risk groups for externalizing behaviors, with students in the 
low-risk group having fewer nurse visits and fewer days spent in in-school suspension than 
students in the moderate- and high-risk groups. Also, students in the moderate-risk group had 
fewer nurse visits and fewer days spent in in-school suspension than students in the high-risk 
group. Given this is the first study examining these variables, the information should be 
considered preliminary until these findings are replicated to be certain results are not spurious 
(Cook, 2014). However, a recent psychometric study of the SRSS-IE in middle and high schools 
suggested fall SRSS-E7 scores could differentiate secondary school nurse visits (Lane et al., 
2017). As discussed in that article, future inquiry into nurse visits is warranted as frequent visits 
to the nurse could signal a range of concerns. For example, students may use visits to the 
nurse’s office to manage issues such as anxiety (e.g., panic attacks), the need to escape 
instruction that is too difficult or too easy, somatic complaints, or a desire to seek solace or attention from 
an adult in a helping profession. Students involved in physical aggression and altercations might 
make repeated visits as well. Conversely, other students may visit the nurse for medication 
management. In short, frequent trips to the nurse can indicate a range of needs, which will vary 
from student to student (Johnson & Hutcherson, 2006; Vernberg, Nelson, Fonagy, & Twemlow, 
2011). The same is true for in-school suspensions (although the base rate was very low). 
Behaviors leading to in-school suspension may indicate an unmet need, and offering students 
additional, proactive supports (often in the form of evidence-based Tier 2 and 3 supports) 
may reduce nurse visits or suspensions. Limiting nurse visits to those with a medical need would 
enable nurses to better manage the health needs of students and reduce overall burden. For 
example, the National Association of School Nurses recommends a nurse-to-student ratio of 
1:750 in a healthy context and a lower ratio in contexts in which students have more nuanced 
health needs. This recommended ratio is frequently not attained due to shortages of school 
nurses and/or fiscal challenges (American Association of Colleges of Nursing, 2014; Holmes et 
al., 2016). 

In examining effect sizes, results indicated medium-to-large magnitude effects when 
differentiating low- and high-risk groups on ORF (1.18), MAP (1.24), nurse visits (0.79), and 
in-school suspensions (0.63). Effect sizes were medium-to-large when differentiating ORF and 
MAP scores between low- and moderate-risk groups (.61 and .87, respectively); yet 
small-to-moderate for nurse visits (.41) and in-school suspensions (.23). Collectively, results suggest fall 
SRSS-E7 scores continue to be an effective screening tool for predicting behavioral and 
academic outcomes for elementary-age students. 

Predictive Validity of Internalizing Scores in Elementary Schools 

In this first predictive validity study of SRSS-I5 scores in elementary schools, results 
suggested kindergarten through fifth-grade students at low, moderate, and high risk for 
internalizing behaviors could also be differentiated on MAP reading and nurse visits for all three 
risk groups as was the case with fall SRSS-E7 scores. Students in the low-risk category had 
higher mean MAP scores and fewer mean nurse visits than students in the moderate- and 
high-risk groups. Students in the moderate-risk group had higher mean MAP scores and fewer mean 
nurse visits than students in the high-risk group. The distinction between the three groups was 
very clear on these two variables as was the case with fall SRSS-E7 scores, yet the magnitude of 
these differences was smaller. 

For ORF and in-school suspensions, the low-risk group could be differentiated from the 
high-risk group in each case (with the low-risk group experiencing more favorable outcomes). 
While the low-risk group also had fewer mean in-school suspensions than the moderate-risk group, 
ORF scores did not distinguish between low- and moderate- or moderate- and high-risk groups. Students in the 
moderate-risk group had mean ORF scores similar to those in the low- and high-risk groups; thus, 
students with internalizing concerns may not demonstrate detectable differences in reading 
progress until the internalizing concerns reach the criteria for high risk. These findings highlight 
the complexity of internalizing behaviors and school outcomes for students with and at risk 
for internalizing behaviors. Internalizing behaviors reflect a broad array of more covert 
behavioral manifestations such as anxiety, social withdrawal, and depression (Bradshaw et al., 
2008; Green et al., 2016). As discussed for several decades, students with strong interpersonal 
skills are able to interact comfortably with a range of individuals: peers, teachers, parents, and 
other authority figures (Rapport, Denney, Chung, & Hustace, 2001; Walker et al., 2014). Yet, 
students who experience internalizing behaviors often struggle in these important relationships, 
making school engagement challenging, and academic outcomes may be affected for those with 
the highest levels of teacher-rated risk. 


Given the lack of clear distinction between moderate- and high-risk groups on several 
variables, it will be important to ensure any indication of risk for internalizing behaviors be 
attended to swiftly. Educators may consider the use of low-intensity teacher supports (e.g., 
intentional use of behavior-specific praise to acknowledge and engage students) or more targeted 
supports according to students’ needs (Lane et al., 2017). 

When considering effect sizes, it should be noted the magnitude of differences between 
low- and high-risk internalizing groups was smaller than the differences between externalizing 
groups for all variables. The small-magnitude differences between groups on in-school 
suspensions were expected. This finding was comparable with results reported by McIntosh, 
Campbell, Carter, and Zumbo (2009) whose inquiry also suggested students with internalizing 
concerns are not adequately detected through reactive procedures such as ODRs that may result 
in in-school suspensions at the elementary level. 

As discussed, replication is essential before drawing a definitive conclusion regarding the 
validity of SRSS-I5 scores for predicting academic and behavioral outcomes. In the 
interim, we urge caution in generalizing these results to other 
locales and to students from other contexts (e.g., more diverse backgrounds). At the same 
time, it would not be wise to prematurely conclude risk is simply a dichotomous variable (low 
vs. any risk) when examining outcomes. Fall SRSS-I5 scores may suggest any level of risk at the 
onset of an academic year is cause for concern and may warrant additional consideration or 
support depending on the breadth of concern (Lane, Oakes, Ennis et al., 2015; Walker et al., 
2014). 

Educational Implications 

We are pleased to offer findings from this researcher-practitioner partnership as 
additional information on the utility of SRSS-E7 scores and preliminary information on the utility 
of SRSS-I5 scores. Results suggested SRSS-IE scores are useful for distinguishing elementary 
students in the low-risk group from students in the moderate- and high-risk groups on most 
academic and behavioral variables examined in this study. Although internalizing behaviors are 
often difficult to detect, fall SRSS-I5 scores did distinguish, at a minimum, between students in 
low- and high-risk groups on all variables. 

With the recommendation that systematic screenings take place three times each year 
(fall, winter, and spring), it will be important to explore the degree to which additional time with 
students (e.g., winter and/or spring scores) will increase predictive validity of SRSS-E7 and 
SRSS-I5 scores collected at later time points. Studies of SRSS-E7 scores suggest this is indeed 
the case at the high school level (Lane, Oakes, Ennis, Cox, Schatschneider, & Lambert, 2013), 
and we hypothesize this will be the case at the elementary level. A key consideration is the need 
to determine if winter and spring internalizing scores are more accurate in predicting student 
outcomes one year later. Given the host of negative outcomes associated with these 
difficult-to-detect and often more covert behaviors, this is a key point for future inquiry. 

At this time, we encourage school leadership teams and individual teachers to move 
forward cautiously when utilizing fall screening scores. In optimal conditions, screening scores 
should be examined in conjunction with other reliable, available data to shape instruction. When 
working in tiered systems, multiple sources of data can be used to inform Tier 1 practices, 
teacher-delivered practices, as well as the use of Tier 2 and Tier 3 supports for students with targeted and 
intensive intervention needs, respectively. For example, in schools in which more than 20% of 
students place into moderate- and high-risk categories for externalizing or internalizing 
behaviors, instructional coaches might offer professional learning to all classified and certified 
staff as well as parents in validated strategies such as behavior-specific praise and incorporating 
choice (instructional choice in schools and choice of activities in the home settings; Ennis, 
Royer, Lane, & Dunlap, 2018) as a Tier 1 practice given the proportion of students 
experiencing behavioral risk. Then, at the next screening time point, students still scoring in the 
moderate-risk category despite high-fidelity implementation of Tier 1 practices might be 
supported with self-management strategies (Carter, Lane, Crnobori, Bruhn, & Oakes, 2011) or 
cognitive restructuring activities as Tier 2 practices (Smith, Taylor, Barnes, & Daunic, 2012). 
Support at each level of prevention will require high-quality professional learning, ideally with 
positive practice, coaching, and performance feedback (Common, Royer, Lane, Leko, & Oakes, 
2018). 
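
The 20% example above can be expressed as a simple check on screening results. The sketch below is illustrative only: the function names, the risk labels, and the treatment of the 20% figure as a strict cutoff are assumptions made for illustration, not a validated decision rule from this study.

```python
def share_at_elevated_risk(risk_levels):
    """Proportion of screened students rated moderate or high risk."""
    if not risk_levels:
        raise ValueError("no screening results supplied")
    elevated = sum(1 for level in risk_levels if level in ("moderate", "high"))
    return elevated / len(risk_levels)


def suggests_schoolwide_focus(risk_levels, threshold=0.20):
    """True when more than `threshold` of students place into moderate or high risk."""
    return share_at_elevated_risk(risk_levels) > threshold


# Hypothetical fall screening results for one school (not study data)
fall_screen = ["low"] * 7 + ["moderate"] * 2 + ["high"]
needs_tier1_learning = suggests_schoolwide_focus(fall_screen)
```

In practice, such a check would be only one input alongside the other reliable data sources described above.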

As explained in the recent IES RFA, the intent is to examine the potential benefits of 
tiered systems as a mechanism that “provides academic, social, emotional, and behavioral 
support for all students, and provides resources and supports that teachers and other school 
personnel need to support” students with and at risk for learning and behavioral challenges in 
authentic educational settings (IES, 2017, p. 17). As mentioned, systematic screening efforts will 
play a central role in focusing intervention efforts. In short, one goal will be to empower faculty, 
staff, and parents with the knowledge, skills, and confidence to incorporate positive behavior 
interventions and supports as part of daily activities. It is important to move past the erroneous 
idea that academic, behavior, and social competencies should be addressed in isolation rather 
than simultaneously (McIntosh & Goodman, 2015; Menzies, Lane, Oakes, & Ennis, 2016). 
Results from this study suggest soft signs of externalizing and internalizing behaviors in the 
elementary years predict important academic and behavioral outcomes. We hope this information 
is useful as we move forward with a comprehensive, integrated approach to meeting students’ 
multiple needs. 

Limitations and Future Directions 

We encourage readers to interpret results with attention to the following limitations. First, 
as with all studies, replication is essential before generalizing findings (Cook, 2014; Travers, 
Cook, Therrien, & Coyne, 2016). Specifically, although this study included a large sample of 
students, they were from one district in one geographical region. As recommended by NCII 
Behavior Screening TRC, additional inquiry is necessary in other geographical locales and 
ideally with ethnically and culturally diverse samples. This is particularly important when 
interpreting SRSS-I5 scores as this is the first study examining the validity of SRSS-I5 
scores in predicting elementary students’ academic and behavioral outcomes. 

Second, in this study we analyzed SRSS-E7 and SRSS-I5 scores in isolation. We 
encourage other research teams to explore issues of co-morbidity by examining predictive 
validity of combined subscale scores (e.g., total scores) given the fact students often present with 
externalizing and internalizing behavior patterns, as was the case with the current sample (see 
Table 5). For example, it would be interesting to examine the extent to which students with 
various facets of EBD (e.g., externalizing but not internalizing; internalizing but not 
externalizing; as well as co-occurring challenges) fared over time in schools implementing a 
tiered system of supports. Although not the focus of this study, it would be interesting to note if 
students accessed evidence-based strategies, practices, and programs as Tier 2 and 3 supports 
and how they responded to this extra assistance. This study examined the constructs of 
externalizing and internalizing in isolation without attention to issues of co-morbidity. While this 
was not a goal of this predictive validity study, we encourage other research teams to examine 
the predictive validity of SRSS-IE scores when used in tandem to address comorbidity. For 
example, an important next step in this line of inquiry is to determine the degree to which co- 
occurrence of externalizing and internalizing patterns (e.g., students with intensive intervention 
needs for externalizing and internalizing behaviors) predicts important educational outcomes for 
students (e.g., academic performance as well as behavioral and social performance patterns; 
Lane et al., 2017). 

Third, we encourage replication with larger samples to examine academic outcomes. 
Although oral reading fluency (ORF) data were available for more than 600 students, this was a very 
small percentage of the current sample. Schools in this sample were in their first year of 
collecting these data and the practice had not yet been taken to scale district-wide. We 
emphasize that all available data collected by the district were analyzed. 

Fourth, the elementary schools in the present study were supported as part of a 
researcher-practitioner partnership. The Ci3T District Leadership Team adopted screening as part of their 
district-wide implementation of Ci3T. They built a screening platform managed by the 
instructional technology departments in conjunction with their teaching and learning department. 
Thus, the district’s certificated employees received on-going professional learning on how to 
conduct systematic screening and use the resulting data to inform instruction. While these 
features are a clear strength, future inquiry is needed to determine if these findings are replicated 
in other systems where systematic screening is supported without the additional resources of 
research partnerships (Lane, Oakes, Ennis et al., 2013; Lane et al., 2017). 

Summary 

Despite limitations, results of this psychometric study of SRSS-IE scores offer 
evidence of predictive validity of SRSS-E7 (measuring externalizing behaviors) and SRSS-I5 
(measuring internalizing behaviors) scores for predicting a range of academic and behavioral 
outcomes for elementary students. Results suggest students with high levels of risk (particularly 
those with externalizing behaviors) as measured by the SRSS-IE at the fall administration time 
point were more apt to have lower ORF scores and more nurse visits compared to students 
at low risk according to SRSS-IE scores. We encourage readers to avoid generalization errors 
and use this information cautiously until replication studies confirm these findings (Cook, 2014; 
Travers et al., 2016). Yet, this study presents important findings in this programmatic line of 
inquiry offering evidence that one teacher’s independent rating in the fall can differentiate 
between elementary students with low and high risk for both major disorders of childhood 
(externalizing and internalizing behaviors) on behavioral (proximal) and academic (distal) 
measures. 